Projectivity in Totally Ordered Rooted Trees: An Alternative Definition of Projectivity and Optimal Algorithms for Detecting Non-Projective Edges and Projectivizing Totally Ordered Rooted Trees
Abstract
This paper discusses the notion of projectivity and algorithms for projectivizing and detecting non-projective edges in totally ordered rooted trees (such trees are used in dependency syntax analysis of natural language, where they are called dependency trees). In the first part, we review the notion of projectivity, then present a new definition inspired by the algorithmic inquiry and show its equivalence with the classical definitions. We define the canonical projectivization of a totally ordered rooted tree (preserving the tree structure and the relative ordering for all inner nodes and their immediate dependents) and show its uniqueness; we also give a generalization of this result. We then discuss some properties of non-projective edges relevant for the algorithms presented in the following section. In the second part, we present a data representation of totally ordered rooted trees and algorithms for this data representation. The first algorithm computes the projectivization of the input tree; the second algorithm detects non-projective edges of certain types in the input tree (we also give a hint on finding all non-projective edges using its output). Both algorithms can be used for checking projectivity. We prove that the algorithms are optimal: both run in O(n) time. Furthermore, they can be straightforwardly combined into a single algorithm, preserving the time complexity.

1 Projectivity in Totally Ordered Rooted Trees

This section discusses the condition of projectivity in totally ordered rooted trees. First we give a definition of a totally ordered rooted tree and introduce some notation. Then we present the classical definition of projectivity, introduce a new one, and show their equivalence. We define the notion of projectivization and show its uniqueness. Finally, we divide non-projective edges into three classes and discuss their relationships.
1.1 Totally Ordered Rooted Trees

We give a definition of totally ordered rooted trees, briefly review some basic properties of rooted trees without proofs, and introduce the notation used in this paper.

1.1.1 Definition A totally ordered rooted tree is a quadruple (V,E,r,≤), where (V,E,r) is a rooted tree (V being the finite set of vertices (or nodes), E the set of edges (unordered pairs of nodes), and r ∈ V the root) and ≤ is a linear ordering on V. (A totally ordered rooted tree is often called a dependency tree.)

In a rooted tree (V,E,r), there is a unique path from the root r to every node a, say x_0 = r, x_1, ..., x_n = a, n ≥ 0, where {x_i, x_{i+1}} ∈ E for 0 ≤ i < n. Therefore every node a has a uniquely defined level equal to the length of the path connecting it with the root, i.e. n, which we will denote lev(a). For every node a ≠ r, we will call b = x_{n−1} the parent of a (with notation a → b; we will also say that a is a child of b or that a depends on b). A node with no children is called a leaf; a node which is not a leaf is an internal node. Obviously, in a rooted tree there is a one-to-one correspondence between the edges and the nodes different from the root (edges correspond uniquely to their “lower” nodes). Nodes with the same parent are called siblings. The height of a rooted tree is the maximal level occurring in it.

When talking about the tree structure, we will use “vertical-axis” terms such as “above”, “below”, “upper”, “lower” etc., with the root being the highest node and the other nodes ordered downwards by increasing level. (Rooted trees are usually drawn “upside-down”, with the root at the top and the other nodes placed downwards according to their level, nodes at the same level being drawn on the same horizontal line.)

The reflexive transitive closure of the relation → will be denoted →*; for a →* c we will say that c is an ancestor of a, or a is a descendant of c, or that a is subordinated to c.
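As a minimal sketch of the notions just introduced (and not the data representation the paper presents in its second part), a rooted tree can be encoded as a parent array. The encoding, the sample tree, and the helper names `level`, `subordinated`, and `subtree_nodes` are assumptions made purely for illustration: nodes are the integers 0..n-1, their numeric order stands in for ≤, and `parent[a]` is `None` for the root.

```python
from typing import List, Optional, Set

# Hypothetical encoding: parent[a] is the parent of node a, None for the root.
Parent = List[Optional[int]]


def level(parent: Parent, a: int) -> int:
    """Length of the path from the root to a, i.e. lev(a)."""
    steps = 0
    while parent[a] is not None:
        a = parent[a]
        steps += 1
    return steps


def subordinated(parent: Parent, a: int, c: int) -> bool:
    """True iff a is subordinated to c (c is a or lies on the path from a to the root)."""
    node: Optional[int] = a
    while node is not None:
        if node == c:
            return True
        node = parent[node]
    return False


def subtree_nodes(parent: Parent, a: int) -> Set[int]:
    """Vertex set of the subtree rooted in a: all nodes subordinated to a."""
    return {x for x in range(len(parent)) if subordinated(parent, x, a)}


# Illustrative tree: node 2 is the root, 0 and 3 depend on 2, and 1 depends on 0.
sample: Parent = [2, 0, None, 2]
print(level(sample, 1))            # 2
print(subordinated(sample, 1, 2))  # True
print(subordinated(sample, 2, 2))  # True: subordination is reflexive
print(subtree_nodes(sample, 0))    # {0, 1}
```

Walking up the parent pointers keeps the sketch short; it makes no attempt at efficiency.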
(Note that the relation of dependency → is irreflexive, whereas the relation of subordination →* is defined as reflexive.)

For every node a of a rooted tree T = (V,E,r) we call the tree T_a = (V_a,E_a,a), where V_a = {x ∈ V | x →* a} and E_a = {{x,y} ∈ E | x,y ∈ V_a}, the subtree of T rooted in node a.

When talking about the linear ordering ≤ on nodes of a totally ordered rooted tree (V,E,r,≤), we will use the usual notation a ≥ b meaning b ≤ a, and a < b meaning a ≤ b and a ≠ b (and similarly for >); we will also use “horizontal-axis” terms such as “left”, “right”, “in between” etc. with the obvious meaning (we will say that a is to the left of b when a < b, etc.).

When drawing totally ordered rooted trees, we adopt the following conventions: nodes are drawn top-down according to their level, with nodes on the same level on the same horizontal line and the root at the top; nodes are drawn from left to right according to the linear ordering on nodes. Edges are drawn as solid lines.

For an edge a → b of a totally ordered rooted tree T = (V,E,r,≤), we call the interval in the linear ordering delimited by the nodes a and b the span of the edge a → b.

Please note that the notion of a totally ordered rooted tree (cf. Definition 1.1.1) differs from the notion of an ordered rooted tree, where for every internal node only a linear ordering of its children is given (i.e. the ordering is not total; it is specified only for sibling nodes). Here we are concerned with rooted trees with a total linear ordering on their nodes.

For the sake of brevity of the definitions in the following section we introduce two predicates:

• A ternary predicate representing the “strictly in between” relation: Inb(x,u,v) =_df (u < x & x < v) ∨ (v < x & x < u). (Obviously, Inb(x,u,v) should be read as “x lies (strictly) between u and v”.)

• A ternary predicate representing the “being siblings” relation: Sibl(u,v,b) =_df (u → b & v → b & u ≠ v).
(Sibl(u,v,b) should be read as “u and v are different children of their common parent b”.)

We will take advantage of the fact that both predicates are symmetric in two of their arguments (Inb in its second and third arguments, Sibl in its first and second arguments).

1.2 Condition of Projectivity for Totally Ordered Rooted Trees

We begin by giving a definition of projectivity using three conditions proved to be equivalent by Marcus (1965) (we adopt his labels for them), and then present a new condition and prove that it is equivalent to one of the classical ones.

1.2.1 Definition (Marcus (1965)) A totally ordered rooted tree T = (V,E,r,≤) is projective if the following equivalent conditions hold:

(H-H) (∀a,b,x ∈ V) ( a → b & Inb(x,a,b) ⟹ x →* b ),

(L-I) (∀a,b,x ∈ V) ( a →* b & Inb(x,a,b) ⟹ x →* b ),

(F) (∀u,v,b,x ∈ V) ( u →* b & v →* b & Inb(x,u,v) ⟹ x →* b ).

A totally ordered rooted tree not satisfying the conditions is called non-projective. (See Figures 1 and 2 for examples of projective and non-projective totally ordered rooted trees, respectively.)

Figure 1: A sample projective tree

Figure 2: A sample non-projective tree

We will not repeat here the proof of the equivalence of the three conditions in Definition 1.2.1; it is quite straightforward and relies on the simple fact that for every two nodes in the relation of subordination there exists a unique finite path between them formed by edges of the rooted tree.

All three conditions in Definition 1.2.1 have the following in common: in a configuration where two (or three) nodes have some structural relationship (i.e. a relationship via the tree structure) and there is a node x between them in the linear ordering, they require that the node x stand in an analogous structural relationship.

Condition (F) is perhaps the most transparent with regard to the structure of the whole tree. It says that every subtree of a projective tree must be contiguous in the linear ordering.
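Conditions (H-H) and (F) can be checked directly by exhaustive search. The sketch below is illustrative only and is far from the linear-time algorithms announced in the abstract. It assumes a hypothetical parent-array encoding (nodes are 0..n-1, integer order plays the role of ≤, `parent[a]` is `None` for the root), and all function names and sample trees are invented for the example.

```python
from typing import List, Optional

# Hypothetical encoding: parent[a] is the parent of node a, None for the root.
Parent = List[Optional[int]]


def subordinated(parent: Parent, x: int, b: int) -> bool:
    """True iff x is subordinated to b (reflexively)."""
    node: Optional[int] = x
    while node is not None:
        if node == b:
            return True
        node = parent[node]
    return False


def inb(x: int, u: int, v: int) -> bool:
    """Inb(x, u, v): x lies strictly between u and v."""
    return (u < x < v) or (v < x < u)


def projective_hh(parent: Parent) -> bool:
    """Brute-force check of (H-H): nodes inside an edge's span are subordinated to its parent end."""
    n = len(parent)
    for a in range(n):
        b = parent[a]
        if b is None:
            continue
        for x in range(n):
            if inb(x, a, b) and not subordinated(parent, x, b):
                return False
    return True


def projective_f(parent: Parent) -> bool:
    """Brute-force check of (F), read as: every subtree is a contiguous interval."""
    n = len(parent)
    for b in range(n):
        nodes = [x for x in range(n) if subordinated(parent, x, b)]
        if max(nodes) - min(nodes) + 1 != len(nodes):
            return False
    return True


projective_tree: Parent = [1, None, 1]         # 0 and 2 depend on the root 1
non_projective_tree: Parent = [2, 3, None, 2]  # edge from 1 to 3 spans the root 2

print(projective_hh(projective_tree), projective_f(projective_tree))          # True True
print(projective_hh(non_projective_tree), projective_f(non_projective_tree))  # False False
```

In the second tree the subtree rooted in node 3 consists of nodes 1 and 3, and the root 2 lies between them, so both checks fail for the same underlying reason.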
A simple reformulation of the condition (F) gives the following condition of projectivity, which makes this point even clearer:¹

(F’) (∀u,v,b ∈ V) ( u →* b & v →* b ⟹ ¬(∃x ∈ V)(Inb(x,u,v) & x ↛* b) ).

The condition (F’) leads naturally to the notion of a gap in the coverage of a subtree (a gap is the set of “extra-subtree” nodes in the span of the subtree, i.e. nodes lying between some nodes of the subtree in the linear ordering without belonging to it). Such a notion of a gap was used by Holan et al. (1998), who introduce measures of non-projectivity and present a class of dependency-based formal grammars allowing for a varying degree of word-order freedom; Holan et al. (2000) present linguistic considerations concerning Czech and English. In our study, however, we will be concerned with a different notion of a gap.

¹ The equivalence of conditions (F) and (F’) is straightforward; see the following first-order-logic reasoning:
(∀u,v,b,x ∈ V)(u →* b & v →* b & Inb(x,u,v) ⟹ x →* b)
⟺ (∀u,v,b,x ∈ V)(u →* b & v →* b ⟹ (Inb(x,u,v) ⟹ x →* b))
⟺ (∀u,v,b,x ∈ V)(u →* b & v →* b ⟹ (¬Inb(x,u,v) ∨ x →* b))
⟺ (∀u,v,b,x ∈ V)(u →* b & v →* b ⟹ ¬(Inb(x,u,v) & x ↛* b))
⟺ (∀u,v,b ∈ V)(u →* b & v →* b ⟹ (∀x ∈ V)(¬(Inb(x,u,v) & x ↛* b)))
⟺ (∀u,v,b ∈ V)(u →* b & v →* b ⟹ ¬(∃x ∈ V)(Inb(x,u,v) & x ↛* b)).

In a non-projective totally ordered rooted tree, there exists at least one edge a → b and a node x not satisfying the condition (H-H). We will call such an edge a non-projective edge of the totally ordered rooted tree. The set X_{a→b} = {x ∈ V | Inb(x,a,b) & x ↛* b} of all nodes causing the non-projectivity of the edge a → b will be called the gap of the edge a → b.

Let us now present, in the form of a theorem, another condition which is equivalent to the conditions in Definition 1.2.1.

1.2.2 Theorem A totally ordered rooted tree T = (V,E,r,≤) is projective if and only if the following condition holds:

(*) (∀a_1,a_2,b,u_1,u_2 ∈ V) ( [ a_1 → b & u_1 →* a_1 & ( [a_2 = b & u_2 = b] ∨ [Sibl(a_1,a_2,b) & u_2 →* a_2] ) ] ⟹ [a_1 < a_2 ⟺ u_1 < u_2] )
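The gap of an edge and condition (*) can both be evaluated by exhaustive search, which may help in reading the definitions. The following is a sketch under stated assumptions, not the paper's algorithm: it reuses a hypothetical parent-array encoding (nodes are 0..n-1, integer order plays the role of ≤, `parent[a]` is `None` for the root), and the names `gap`, `non_projective_edges`, and `satisfies_star` are invented for the example.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical encoding: parent[a] is the parent of node a, None for the root.
Parent = List[Optional[int]]


def subordinated(parent: Parent, x: int, b: int) -> bool:
    """True iff x is subordinated to b (reflexively)."""
    node: Optional[int] = x
    while node is not None:
        if node == b:
            return True
        node = parent[node]
    return False


def gap(parent: Parent, a: int) -> Set[int]:
    """Gap of the edge from a to its parent b: nodes strictly between a and b not subordinated to b."""
    b = parent[a]
    if b is None:  # the root has no incoming edge
        return set()
    lo, hi = min(a, b), max(a, b)
    return {x for x in range(lo + 1, hi) if not subordinated(parent, x, b)}


def non_projective_edges(parent: Parent) -> List[Tuple[int, Optional[int]]]:
    """Edges (child, parent) whose gap is non-empty."""
    return [(a, parent[a]) for a in range(len(parent)) if gap(parent, a)]


def satisfies_star(parent: Parent) -> bool:
    """Brute-force check of condition (*) over all admissible a1, a2, b, u1, u2."""
    n = len(parent)
    subtree = [[x for x in range(n) if subordinated(parent, x, a)] for a in range(n)]
    for a1 in range(n):
        b = parent[a1]
        if b is None:
            continue
        siblings = [a2 for a2 in range(n) if parent[a2] == b and a2 != a1]
        for u1 in subtree[a1]:
            # Case a2 = b and u2 = b.
            if (a1 < b) != (u1 < b):
                return False
            # Case Sibl(a1, a2, b) with u2 subordinated to a2.
            for a2 in siblings:
                for u2 in subtree[a2]:
                    if (a1 < a2) != (u1 < u2):
                        return False
    return True


non_projective_tree: Parent = [2, 3, None, 2]
print(gap(non_projective_tree, 1))                # {2}
print(non_projective_edges(non_projective_tree))  # [(1, 3)]
print(satisfies_star(non_projective_tree))        # False
print(satisfies_star([1, None, 1]))               # True
```

On the non-projective sample, (*) fails for a_1 = 3, b = 2, u_1 = 1: node 3 lies to the right of its parent 2, while its descendant 1 lies to the left of it.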